(54) Method and apparatus for simplifying the access of metadata



(57) Available storage media capacity for personal
video recording increases continuously. Metadata can
be used to organize the recordings, search for content
and access specific recordings. If metadata are embedded
within the multimedia content itself, like DVB specific
Service Information, which is multiplexed with the
audio and video streams to form an MPEG-2 transport
stream, a search based on these metadata would require
an inefficient and time-consuming search through all
multimedia content stored. According to the invention,
metadata information is gathered, analyzed and processed
to form metadata entities, which are amended by
a reference to the content itself. A descriptor stream
(DS) is formed from the resulting pairs of metadata
entities and references to the content and is stored
separately from the files comprising multimedia content. In
this way, for data of an MPEG-2 transport stream the
metadata can be accessed without a need to reparse
the entire stream.










EP 1 271 537 A1 






Description 

[0001] The invention relates to a method and to an
apparatus for simplifying the access of metadata, which
are associated with a file comprising multimedia data or
a part of said file, especially for describing the content
of said multimedia data and/or searching said file or file
part among a plurality of files comprising multimedia
data, wherein the metadata are originally multiplexed with
said multimedia data.

Background 

[0002] Available storage media capacity for personal
video recording increases continuously, approximately
doubling every 2 years. Currently it is possible to store
about 20 full-length movies on a single 100 GByte hard
disk. In 2005, it will likely be possible to store about 80
movies on a single 400 GByte hard disk.
[0003] Similar figures apply to optical recording:
currently about 5 GByte can be stored on a single-layer,
single-sided DVD disc, but the DVR recorder, as a successor
of today's DVD recorder, will allow storage of up to
35 GBytes on a corresponding disc. Furthermore, two
or even more layers can be used per side and these can
be applied to both sides of the disc. Finally, it is possible
to combine several discs in a special magazine.
[0004] This enormous amount of data requires new 
ways to organize the recordings, search for content and 
access specific recordings, because it is no longer pos- 
sible to find recordings in a user's bookshelf by just look- 
ing at the video cassettes/discs and some annotations 
on their cover. One possible solution for this is to use 
so-called metadata, defined as data about data, for the 
recorded content. 

[0005] Metadata can be embedded within the multimedia
content itself. For example, the MPEG-2 systems
standard as specified in ISO/IEC 13818-1 defines program
specific information (PSI) which is multiplexed with
the audio and video streams. Similarly, the DVB standard
used for the transmission of digital television signals
specifies Service Information (DVB-SI) included in a
DVB compliant MPEG-2 transport stream multiplex.
[0006] Cecarelli et al.: "Home multimedia systems: on
personal video libraries", MULTIMEDIA COMPUTING
AND SYSTEMS, 1999, IEEE INTERNATIONAL CONFERENCE
IN FLORENCE, ITALY, 7-11 JUNE 1999,
LOS ALAMITOS, CA, USA, IEEE COMPUT. SOC, US,
7 June 1999, pages 1082-1085, XP010342599, ISBN:
0-7695-0253-9, describe a system where metadata are
extracted from the multimedia content and are stored
separately from the multimedia content in a Multimedia
Database Management System (MM-DBMS). The described
system stores the multimedia content on tape
and stores the database of the MM-DBMS on hard disk.
This approach targets a hard disk based archive
system, where the extracted metadata always stay
within the device and are not intended for metadata
exchange by means of removable media, as is required
for optical recording.

Invention 


[0007] The invention is based on the recognition of the 
following fact. Given the availability of metadata multi- 
plexed into the multimedia content itself it is possible to 
access the metadata directly from the bitstream, like the 
DVB-SI information directly from the MPEG-2 transport 
stream. However, for recorded data like a broadcasted 
DVB television signal which is recorded on a disc after 
reception, a search based on these metadata would re- 
quire a full search through all multimedia content stored 
in order to collect that metadata. This is both inefficient 
and time consuming. 

[0008] Therefore, a problem to be solved by the invention
is to make metadata information multiplexed into
the multimedia content itself more easily available for
automatic or electronic access, in particular for metadata
based searches, browsing or presentation engines.
This problem is solved by the method disclosed in claim
1. An apparatus that utilizes this method is disclosed in
claim 8.

[0009] According to the invention the metadata are 
extracted from the multimedia content multiplex. The ex- 
tracted metadata are gathered and analyzed to form 
metadata entities, which are amended by a reference to 
the content itself. A descriptor stream is formed from the 
resulting pairs of metadata entities and references to the 
content and is stored separately from the files compris- 
ing multimedia content. 

[0010] In this way the metadata attached to the mul- 
timedia content allow efficient and fast automatic con- 
tent referencing, content location and automatic or elec- 
tronic access. 

[0011] Advantageously, the invention can be used for
accessing metadata addressing a file or parts of a file
recorded on a storage medium. In this case, processing
the metadata is performed during a recording process
of the files comprising multimedia content. In particular,
for data of a recorded MPEG-2 transport stream this
allows the metadata to be accessed without a need to
reparse the entire stream.

[0012] The processing of the metadata can be per- 
formed during the recording process of the files or file 
parts. This has the advantage that the metadata are im- 
mediately available for metadata based searches. 
[0013] However, it can also be advantageous to perform
the processing of the metadata in an offline pass
after the recording process, e.g. if an MPEG transport
stream is recorded as it is without demultiplexing of the
elementary streams.

[0014] Furthermore, it can be advantageous to com- 
plete the metadata extracted from the multimedia con- 
tent multiplex by metadata retrieved from another 
source, e.g. by metadata transmitted by a service pro- 
vider via internet. 
[0015] Also the metadata extracted from the multimedia
content multiplex can be supplemented by inputs
from the user, e.g. using a keyboard. This allows the
user to make personal annotations.
[0016] Further advantageous embodiments of the in- 
vention result from the following description. 

Drawing 

[0017] Exemplary embodiments of the invention are 
described with reference to the accompanying drawing, 
which shows in: 

Fig. 1 the processing of a separate descriptor stream 
comprising metadata. 

Exemplary embodiments 

[0018] Exemplary embodiments of the invention are 
described in the following. Although the further descrip- 
tion concentrates on the processing of an MPEG-2 
transport stream, most embodiments can easily be gen- 
eralised for use in any kind of multiplexed bitstreams 
comprising metadata. 

[0019] In Fig. 1 a DVB compliant MPEG-2 transport
stream DVBTS containing multimedia data and DVB-SI
data represents the multimedia content multiplex. The
multimedia data can comprise arbitrary data, but in
particular include video and audio data. The DVB-SI data
consist of metadata-carrying Descriptors that are
encapsulated into SI sections and SI tables and may
stretch across multiple MPEG-2 transport packets that
are not necessarily consecutive inside the transport
stream multiplex. For further details reference is made
to the MPEG-2 systems standard ISO/IEC 13818-1.
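The non-consecutive placement of SI data described above can be illustrated with a minimal Python sketch. It only groups packet payloads by PID and remembers each packet's byte offset. It relies on the 188-byte packet layout of ISO/IEC 13818-1 (sync byte 0x47, 13-bit PID, adaptation field control). The function names, the helper `make_packet` and the sample data are illustrative assumptions and not part of this description; section reassembly (pointer field, CRC) is deliberately left out.

```python
from collections import defaultdict

TS_PACKET_SIZE = 188  # fixed MPEG-2 transport packet length in bytes
SYNC_BYTE = 0x47


def collect_payloads(ts):
    """Group packet payloads by PID, keeping the byte offset of each packet."""
    by_pid = defaultdict(list)
    for offset in range(0, len(ts) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts[offset:offset + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue  # a real parser would resynchronise here
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        adaptation_field_control = (pkt[3] >> 4) & 0x3
        if adaptation_field_control in (0b00, 0b10):
            continue  # reserved value or adaptation field only: no payload
        start = 4
        if adaptation_field_control == 0b11:
            start += 1 + pkt[4]  # skip the length byte and the adaptation field
        by_pid[pid].append((offset, pkt[start:]))
    return dict(by_pid)


def make_packet(pid, payload):
    """Hypothetical helper: build a payload-only packet padded with 0xFF."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + payload.ljust(TS_PACKET_SIZE - 4, b"\xff")


# Two SI packets (PID 0x0012 chosen as an example) separated by a video packet.
ts = make_packet(0x12, b"AB") + make_packet(0x100, b"video") + make_packet(0x12, b"CD")
si_packets = collect_payloads(ts)[0x12]
```

The recorded offsets are what a recording engine could later use as position references into the stored stream.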
[0020] The recording engine RE collects all the data
bytes that belong to a given DVB-SI Descriptor from the
MPEG-2 transport packets and also memorizes a position
reference inside the MPEG-2 transport stream
where the DVB-SI Descriptor became valid. Both the
Descriptor data and the start position reference are
stored. From time to time a Descriptor is collected that
is meant as a replacement (update) for a Descriptor that
has already been found earlier in the same MPEG-2
transport stream. This means that the previous Descriptor
becomes invalid. The recording engine then stores the
end position reference alongside the already stored
start position reference of the previous DVB-SI Descriptor.
At the end of the MPEG-2 transport stream, the
recording engine checks all stored DVB-SI Descriptors
and stores an end position for every Descriptor that has
not been invalidated so far. The start position reference
and end position reference as well as a reference to the
stored MPEG-2 transport stream itself form a so-called
Content Reference or Content Locator. All pairs of
Descriptor and Content Reference are arranged to form a
Descriptor Stream DS, which is stored by the storage
system SS separately from the MPEG-2 transport
stream DVBTS. For this purpose arbitrary storage systems
can be used, e.g. optical storage devices or hard
disk drives. Usually, both the Descriptor Stream DS and
the MPEG-2 transport stream DVBTS are stored in
respective separate files DSF, DVBTSF on the same storage
medium. However, for some applications it is also
useful to store them on different storage media.
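The bookkeeping performed by the recording engine can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the claimed implementation: the class and function names, the use of byte offsets as position references, and the file name `rec001.ts` are all hypothetical. A Descriptor repeated with an unchanged value is treated as still valid, and only a Descriptor of the same type with a different value closes the previous entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentReference:
    stream_file: str            # reference to the stored transport stream
    start: int                  # position where the Descriptor became valid
    end: Optional[int] = None   # position where it was invalidated


@dataclass
class Entry:
    descriptor_type: str
    value: str
    ref: ContentReference


def build_descriptor_stream(events, stream_file, stream_length):
    """events: (position, descriptor_type, value) tuples in stream order."""
    stream = []
    open_entries = {}  # descriptor_type -> Entry that is currently valid
    for position, dtype, value in events:
        current = open_entries.get(dtype)
        if current is not None:
            if current.value == value:
                continue  # repeated transmission, the Descriptor stays valid
            current.ref.end = position  # the update invalidates the old one
        entry = Entry(dtype, value, ContentReference(stream_file, position))
        open_entries[dtype] = entry
        stream.append(entry)
    # End of stream: close every Descriptor that was never invalidated.
    for entry in open_entries.values():
        entry.ref.end = stream_length
    return stream


events = [
    (0, "event_name", "News"),
    (0, "service_name", "Channel 1"),
    (1000, "event_name", "News"),    # repetition, no new entry
    (5000, "event_name", "Movie"),   # replacement
]
ds = build_descriptor_stream(events, "rec001.ts", 9000)
```

Each element of `ds` is one pair of Descriptor and Content Reference. Writing the list to a file separate from the transport stream would correspond to the file DSF.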
[0021] The Descriptor Stream can later be amended
by any kind of Descriptor and Content Reference pairs.
Sources OMS other than the MPEG-2 transport stream
DVBTS can be used for retrieving the metadata. In
particular, the metadata can be generated by automatic
feature extraction, symbolized by the broken arrow in the
figure, or the metadata can be downloaded from the
Internet. User annotations UA can be added as well, using
the user interface UI, which may comprise a graphical
display and some manual input means like a remote
control or a keyboard or some speech input means. The
user interface UI can also be used to launch a metadata
query MQ, e.g. for accessing a certain multimedia file
or scene included in the stored DVB transport stream
files.

The result of the metadata query, i.e. the corresponding
Descriptor and Content Reference pairs, is given back
to the user interface UI, especially if the query results
in more than one hit. For informing the user about the
query result a corresponding display, e.g. showing a
table of found files, or a speech output may be used. After
the user chooses one among several found files, the
Content Reference of the selected file is supplied to the
playback engine PE for playback of the DVB transport
stream comprising the requested file described by the
Content Reference CR.

However, if as a response to a query only a single file is 
found, the Content Reference CR and the respective 
DVB transport stream comprising the found file can also 
directly be supplied to the playback engine PE skipping 
the user selection process. 
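The query flow described above, including the shortcut for a single hit, might look like the following Python sketch. The record layout, the substring matching and the `choose` callback that stands in for the user interface are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    descriptor: str   # metadata text, e.g. an event name
    stream_file: str  # stored transport stream the pair refers to
    start: int
    end: int


def metadata_query(descriptor_stream, term):
    """Return all pairs whose Descriptor text contains the search term."""
    term = term.lower()
    return [p for p in descriptor_stream if term in p.descriptor.lower()]


def resolve(hits, choose):
    """Pick the pair to play back, asking the user only when needed."""
    if not hits:
        return None
    if len(hits) == 1:
        return hits[0]   # single hit: skip the user selection process
    return choose(hits)  # several hits: let the user interface decide


stream = [
    Pair("Evening News", "rec001.ts", 0, 5000),
    Pair("Late Movie", "rec001.ts", 5000, 9000),
    Pair("Morning News", "rec002.ts", 0, 7000),
]

single = resolve(metadata_query(stream, "movie"), choose=lambda hits: hits[0])
several = resolve(metadata_query(stream, "news"), choose=lambda hits: hits[-1])
```

The returned pair carries the Content Reference that would be handed to the playback engine PE.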

[0022] Instead of amending complete Descriptor and 
Content Reference pairs, it is also possible to update, 
modify or replace either a Descriptor or a Content Ref- 
erence exclusively. 

[0023] A Descriptor Stream may also be generated by 
a process completely independent from the recording 
engine described above. If the multimedia content does 
not carry embedded metadata, it would also be possible 
to store a Descriptor Stream in the same format, but the 
pairs of Descriptor and Content Reference are generat- 
ed from out-of-band data (e.g. user annotations, internet 
downloads, feature extraction). 

[0024] The Descriptors in the Descriptor Stream may 
also be stored in a different encoding. For instance it is 
beneficial to transcode DVB-SI Descriptors from their bi- 
nary encoding into an XML encoding. Other transport or 
storage encodings may exist. 
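As an illustration of such a transcoding step, the following Python sketch converts a binary descriptor loop into XML. It relies only on the generic DVB descriptor layout of one tag byte, one length byte and the payload. The XML element and attribute names are hypothetical, and the payload is kept as a hex string rather than decoded field by field, which a real transcoder would do per descriptor type.

```python
import xml.etree.ElementTree as ET


def descriptors_to_xml(raw):
    """Transcode a binary descriptor loop (tag, length, payload) into XML."""
    root = ET.Element("descriptors")
    pos = 0
    while pos + 2 <= len(raw):
        tag, length = raw[pos], raw[pos + 1]
        payload = raw[pos + 2:pos + 2 + length]
        if len(payload) < length:
            raise ValueError("truncated descriptor")
        element = ET.SubElement(
            root, "descriptor", {"tag": f"0x{tag:02X}", "length": str(length)}
        )
        element.text = payload.hex()  # payload kept opaque in this sketch
        pos += 2 + length
    return ET.tostring(root, encoding="unicode")


# Example input: one descriptor with a 3-byte payload, one with no payload.
raw = bytes([0x4D, 3]) + b"abc" + bytes([0x48, 0])
xml_text = descriptors_to_xml(raw)
```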

[0025] For some DVB-SI Descriptors (e.g. EPG data)
it is important to know from what table or context they
have been extracted. In such cases it is beneficial
to store such context information together with the
Descriptor and Content Reference pairs.
[0026] The invention offers the following advantages:

[0027] The separately stored Descriptor Stream al- 
lows for an easy and fast access to the metadata infor- 
mation by a metadata based search, browsing or pres- 
entation engine. 

[0028] The separately stored Descriptor Stream al- 
lows for an easy and fast access to the metadata infor- 
mation by a metadata based search, browsing and pres- 
entation engine. 

[0029] The separate Descriptor Stream can be stored
on the same disc as the multimedia multiplex. If the disc
is an exchangeable medium (e.g. an optical disc), the
extracted metadata stored in the Descriptor Stream
become exchangeable together with the stored multimedia
content. That means the extracted metadata and the
multimedia content form an exchangeable bundle.
[0030] In addition to, or instead of, the storage on the 
same disc, the separate Descriptor Stream can also be 
stored on a different disc or multiple different discs, to 
allow for the exchange of the extracted metadata. This 
is beneficial for archive functionality and for other kinds 
of metadata processing. 

[0031] Compared to a system based on a Multimedia
Database Management System (MM-DBMS) as described
by Cecarelli et al., the proposed direct storage
of a Descriptor Stream during recording offers the
following advantages. It consumes much less processing
performance, e.g. in view of real-time constraints, compared
to the insertion and indexing overhead that is typically
involved in an MM-DBMS insert operation. If the database
of the MM-DBMS needed to be stored on an
exchangeable medium, the database import operations
during disc insertion and the database export operations
during disc ejection would become prohibitive. In other
words, such a known MM-DBMS would not be suited for
exchangeable media.

[0032] DVB-SI Descriptors become valid within the 
multimedia multiplex as soon as they are transmitted. 
They are either invalidated by the transmission of a De- 
scriptor of the same Descriptor type but with different 
values, or, by the end of the transmission. Having a De- 
scriptor Stream allows for the addition of validity infor- 
mation (start, end) that is more convenient to use. 
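With explicit start and end positions, finding the Descriptors that apply at a given playback position reduces to a simple interval test, as in the following Python sketch. The dictionary layout and the half-open interval convention (start included, end excluded) are assumptions made for illustration.

```python
def descriptors_valid_at(descriptor_stream, position):
    """Return the entries whose validity interval contains the position."""
    return [d for d in descriptor_stream if d["start"] <= position < d["end"]]


descriptor_stream = [
    {"type": "event_name", "value": "News", "start": 0, "end": 5000},
    {"type": "event_name", "value": "Movie", "start": 5000, "end": 9000},
    {"type": "service_name", "value": "Channel 1", "start": 0, "end": 9000},
]

valid = descriptors_valid_at(descriptor_stream, 5000)
```

Without the stored interval, the same question would require scanning the transport stream backwards from the position until the most recent Descriptor of each type is found.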
[0033] Descriptors from different origins may come in 
different encodings. The storage of a separate Descrip- 
tor Stream offers a way to have a unified encoding for 
the Descriptors (e.g. XML). 

[0034] Descriptors may have been generated by an 
offline process or transmitted as out of band data. A sep- 
arate Descriptor Stream offers a possibility to store all 
these Descriptors together. 

[0035] The invention is applicable to all kinds of
electronic multimedia content referencing and content
location, for instance in connection with DVR
standardisation, metadata, Content Referencing, Content Location,
Personal Video Recorder, Personal Digital Recorder,
Optical Storage, Hard Disk Storage, Home Server, and
Web Enabled Storage.


Claims 

1. Method for simplifying the access of metadata,
which are associated with a file comprising multimedia
data or a part of said file, especially for describing
the content of said multimedia data and/or
searching said file or file part among a plurality of
files comprising multimedia data, wherein the metadata
are originally multiplexed with said multimedia
data, characterized in
extracting the metadata from the multimedia content
multiplex (DVBTS);
gathering the extracted metadata;
analyzing the gathered metadata to form metadata
entities;
amending the metadata entities by a reference to
the content itself;
forming a descriptor stream (DS) from the resulting
pairs of metadata entities and references to the content;
storing (SS) said descriptor stream (DSF) separately
from the files comprising multimedia content.

2. Method according to claim 1, wherein the files or file
parts comprising multimedia content are recorded
on a storage medium (SS) and wherein the metadata
are used for addressing said recorded files or
file parts (DVBTSF).

3. Method according to claim 2, wherein the processing
of the metadata is performed during the recording
process of the files or file parts.

4. Method according to claim 2, wherein the processing
of the metadata is performed in an offline pass
after the recording process.

5. Method according to one of claims 1 to 4, wherein
the metadata extracted from the multimedia content
multiplex are completed by metadata retrieved from
another source (OMS).

6. Method according to one of claims 1 to 4, wherein
the metadata extracted from the multimedia content
multiplex are supplemented by inputs from the user,
e.g. using a keyboard (UI).

7. Method according to one of claims 1 to 6, wherein
said multiplex of multimedia data and metadata
corresponds to a DVB compliant MPEG-2 transport
stream (DVBTS) and wherein the metadata multiplexed
into the multimedia content correspond to
the DVB-SI information.

8. Apparatus for simplifying the access of metadata,
which are associated with a file comprising multimedia
data or a part of said file, especially for describing
the content of said multimedia data and/or
searching said file or file part among a plurality of
files comprising multimedia data, wherein the metadata
are multiplexed with said multimedia data,
characterized in
means for extracting the metadata from the multimedia
content multiplex;
means for gathering the extracted metadata;
means for analyzing and processing the gathered
metadata to form metadata entities;
means for amending the metadata entities by a reference
to the content itself;
means for forming a descriptor stream from the resulting
pairs of metadata entities and references to
the content;
means for storing said descriptor stream separately
from the files comprising multimedia content.